@roomote roomote bot commented Nov 5, 2025

Description

This PR addresses Issue #9048 regarding slow codebase indexing by explicitly excluding .git directories from file listing operations.

Problem

Users reported slow codebase indexing, specifically mentioning that large .git folders were being processed. While .git was already in the DIRS_TO_IGNORE list, there were scenarios where it could still be included:

  • When a hidden directory or a directory from the ignore list is targeted explicitly, the code passes the --no-ignore-vcs and --no-ignore flags to ripgrep
  • These flags disable the normal ignore-file handling (.gitignore and similar), which could let .git contents slip through
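
The condition described above can be sketched roughly as follows. This is a minimal illustration, not the code from list-files.ts: the function names are hypothetical, and the contents of `DIRS_TO_IGNORE` other than `.git` are assumed for the example.

```typescript
// Hypothetical stand-in for the real ignore list; only ".git" is confirmed by this PR.
const DIRS_TO_IGNORE: string[] = [".git", "node_modules"];

// True when the last path segment is a hidden directory or appears in the ignore list.
function isExplicitlyTargetingIgnoredDir(targetPath: string): boolean {
  const segments = targetPath.split(/[\\/]+/).filter((s) => s.length > 0);
  const last = segments[segments.length - 1] ?? "";
  const isHidden = last.startsWith(".") && last !== "." && last !== "..";
  return isHidden || DIRS_TO_IGNORE.includes(last);
}

// Returns the ripgrep flags that switch off ignore-file handling for such targets.
// With these flags present, nothing else keeps .git out unless a glob excludes it.
function ignoreOverrideFlags(targetPath: string): string[] {
  return isExplicitlyTargetingIgnoredDir(targetPath)
    ? ["--no-ignore-vcs", "--no-ignore"]
    : [];
}
```

For an ordinary target such as `/repo/src` no override flags are produced, so the gap only opens for hidden or ignore-listed targets.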

Solution

  • Added explicit .git exclusion as a critical priority pattern that is ALWAYS applied, regardless of other settings
  • This ensures .git is excluded even when --no-ignore-vcs and --no-ignore flags are used for other purposes
  • Applied the exclusion in both recursive and non-recursive modes
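
A rough sketch of how such an argument list might be assembled is shown below. The `ListOptions` shape and `buildRipgrepArgs` name are assumptions made for illustration; only the glob patterns and the `--no-ignore-vcs` / `--no-ignore` flags come from this PR. The use of `--files`, `--hidden`, and `--max-depth` is likewise assumed rather than taken from the actual implementation.

```typescript
// Hypothetical options object; the real list-files.ts may be structured differently.
interface ListOptions {
  recursive: boolean;
  targetsIgnoredDir: boolean; // true when a hidden or ignore-listed directory is targeted
  extraExcludes?: string[];   // other directory globs to exclude
}

function buildRipgrepArgs(opts: ListOptions): string[] {
  // List files (rather than search contents) and include hidden entries.
  const args: string[] = ["--files", "--hidden"];

  if (opts.targetsIgnoredDir) {
    // These flags disable ignore-file handling, which is why .git needs its own glob.
    args.push("--no-ignore-vcs", "--no-ignore");
  }

  // The .git exclusion is always added, independent of the flags above.
  if (opts.recursive) {
    args.push("-g", "!**/.git/**");
  } else {
    args.push("--max-depth", "1");
    args.push("-g", "!.git", "-g", "!.git/**");
  }

  // Remaining exclusions follow the .git patterns.
  for (const pattern of opts.extraExcludes ?? []) {
    args.push("-g", `!${pattern}`);
  }
  return args;
}
```

Because the arguments are passed as an array rather than through a shell, the globs need no quoting here, unlike the `-g '!**/.git/**'` form used when typing the command by hand.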

Changes

  1. src/services/glob/list-files.ts

    • Added explicit -g '!**/.git/**' pattern in recursive mode
    • Added explicit -g '!.git' and -g '!.git/**' patterns in non-recursive mode
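    • These patterns are added before the other exclusions. Note that ripgrep gives precedence to the later glob when several match, so the exclusion holds only as long as no later -g pattern re-includes .git paths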
  2. src/services/glob/tests/list-files-git-exclusion.spec.ts

    • Added comprehensive tests covering:
      • .git exclusion in recursive mode
      • .git exclusion in non-recursive mode
      • .git exclusion when targeting hidden directories
      • .git exclusion when .git itself is the target directory
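
The property these scenarios check can be expressed as a small predicate over the returned paths. The helpers below are a hypothetical sketch and are not taken from list-files-git-exclusion.spec.ts; they only show one way to detect a `.git` path leaking into a result list.

```typescript
// True when any path segment is exactly ".git" (so ".gitignore" and ".github" do not match).
function isInsideGitDir(filePath: string): boolean {
  return filePath.split(/[\\/]+/).includes(".git");
}

// Returns the subset of paths that should never appear in a file listing.
function findGitLeaks(paths: string[]): string[] {
  return paths.filter(isInsideGitDir);
}
```

A test for any of the scenarios above could then assert that `findGitLeaks(result)` is empty for the listing it obtained.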

Impact

  • Prevents indexing of .git folder contents, which can be very large in repositories with extensive history
  • Improves codebase indexing performance
  • No breaking changes - this is a performance optimization

Testing

  • All existing tests pass ✅
  • New tests added specifically for .git exclusion ✅
  • Tests verify that .git is excluded in various scenarios

Notes

While .git was already in DIRS_TO_IGNORE, this explicit exclusion provides an extra layer of protection and ensures consistent behavior across all code paths. The issue reporter also mentioned wanting to exclude "these folders and files" (plural), which suggests they may want a more general exclusion mechanism. If needed, we can explore adding a configuration option for custom exclusions in a follow-up PR.

Fixes #9048


Important

Explicitly excludes .git directories from indexing in list-files.ts to improve performance, with comprehensive tests in list-files-git-exclusion.spec.ts.

  • Behavior:
    • Explicitly excludes .git directories from indexing in list-files.ts for both recursive and non-recursive modes.
    • Ensures .git exclusion even when --no-ignore-vcs and --no-ignore flags are used.
  • Implementation:
    • Adds -g '!**/.git/**' in recursive mode and -g '!.git', -g '!.git/**' in non-recursive mode in list-files.ts.
    • Applies exclusion patterns before other exclusions for priority.
  • Testing:
    • Adds list-files-git-exclusion.spec.ts to test .git exclusion in various scenarios, including recursive, non-recursive, and when targeting hidden directories.

This description was created by Ellipsis for a59ca7b.

- Added explicit .git exclusion in ripgrep arguments for both recursive and non-recursive modes
- This prevents indexing of .git folder contents which improves performance
- Added comprehensive tests to verify .git exclusion in various scenarios

Fixes #9048
@roomote roomote bot requested review from cte, jr and mrubens as code owners November 5, 2025 09:53
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Nov 5, 2025
roomote bot commented Nov 5, 2025

Review complete. No issues found.

The changes correctly add explicit .git directory exclusion to improve codebase indexing performance. The implementation is sound, well-tested, and follows existing patterns.

@dosubot dosubot bot added the bug Something isn't working label Nov 5, 2025
@hannesrudolph hannesrudolph added the Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. label Nov 5, 2025
@daniel-lxs daniel-lxs closed this Nov 5, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Nov 5, 2025
@github-project-automation github-project-automation bot moved this from Triage to Done in Roo Code Roadmap Nov 5, 2025
Development

Successfully merging this pull request may close these issues.

[ENHANCEMENT] Codebase indexing seems to be very slow

4 participants